Add support for ground bv2nat and int2bv reasoning #733

bclement-ocp · 2023-07-12T08:41:01Z

This patch adds support for "combined" reasoning about the int2bv and bv2nat symbols in the arith and bitv reasoners.

More specifically:

The constructors for bv2nat and int2bv in expr.ml are now "smart" and know that they are related, so that we effectivly have the following rewrite rules:

(bv2nat ((_ int2bv n) x)) -> (mod x (pow 2 n))
((_ int2bv n) (bv2nat (as x (_ BitVec m))))
-> (ite (> m n) (extract (- n 1) 0 x) (zero_extend (- m n) x))

These are added early in the bv2nat and int2bv constructors to ensure that we never actually build ((_ int2bv n) (bv2nat e)) or (bv2nat ((_ int2bv n) e)) terms. This is actualy useful because the simplifications done for bv2nat (see below) are more powerful than what int2bv can recover, and there would be information loss otherwise e.g. for ((_ int2bv n) (bv2nat (concat e1 e2))) which would become

((_ int2bv n) (+ (* (bv2nat e1) (pow 2 m)) (bv2nat e2)))

instead of

(concat e1 e2)

(assuming that e2 is of type (_ BitVec m) and e1 of type (_ BitVec |n - m|)).
The arith and bitv reasoners perform constant propagation of int2bv and bv2nat
The arith reasoner performs the ((_ int2bv n) (bv2nat x)) conversion more aggressivey, i.e. it performs the conversion for terms of the form ((_ int2bv n) e) where the semantic value of e is of the form (bv2nat e')

More complex transformations, such as transforming (int2bv (* 2 (bv2nat e))) into (concat e #b0), are not supported.
The bitv reasoner also performs the (bv2nat ((_ int2bv n) e)) conversion more aggressively, i.e. for terms that reduce to ((_ int2bv n) e) after simplification.
The bitv reasoner performs additional simplifications on bv2nat terms after normalization. For instance, (bv2nat (concat x y)) becomes (+ (* (pow 2 n) (bv2nat x)) (bv2nat y)) when y is a bit-vector of size n, and (bv2nat (extract j i x)) becomes (mod (div (bv2nat x) (pow 2 i)) (pow 2 (+ (- j i) 1)))
Finally, the bitv reasoner requires that the result of bv2nat for a bitvector of size n is in the range 0 ..< 2^n

Combined, this allows exploiting the arith reasoner for a variety of problems mostly involving bitvector arithmetic with constants. In particular, we are able to handle many aritmetic expressions involving bv2nat.

On the other hand, bitvector expressions involving int2bv have more limited support. Indeed, not only does int2bv perform less simplifications, we are currently not able to exploit the relevant equivalence:

int2bv[n](x) = int2bv[n](y) iff x = y (mod 2^n)

This patch adds support for "combined" reasoning about the int2bv and bv2nat symbols in the arith and bitv reasoners. More specifically: - The constructors for bv2nat and int2bv in `expr.ml` are now "smart" and know that they are related, so that we effectivly have the following rewrite rules: `(bv2nat ((_ int2bv n) x))` -> `(mod x (pow 2 n))` `((_ int2bv n) (bv2nat (as x (_ BitVec m))))` -> `(ite (> m n) (extract (- n 1) 0 x) (zero_extend (- m n) x))` These are added early in the bv2nat and int2bv constructors to ensure that we *never* actually build `((_ int2bv n) (bv2nat e))` or `(bv2nat ((_ int2bv n) e))` terms. This is actualy useful because the simplifications done for `bv2nat` (see below) are more powerful than what `int2bv` can recover, and there would be information loss otherwise e.g. for `((_ int2bv n) (bv2nat (concat e1 e2)))` which would become `((_ int2bv n) (+ (* (bv2nat e1) (pow 2 m)) (bv2nat e2)))` instead of `(concat e1 e2)` (assuming that `e2` is of type `(_ BitVec m)` and `e1` of type `(_ BitVec |n - m|)`). - The arith and bitv reasoners perform constant propagation of int2bv and bv2nat - The arith reasoner performs the `((_ int2bv n) (bv2nat x))` conversion more aggressivey, i.e. it performs the conversion for terms of the form `((_ int2bv n) e)` where the semantic value of `e` is of the form `(bv2nat e')` More complex transformations, such as transforming `(int2bv (* 2 (bv2nat e)))` into `(concat e #b0)`, are not supported. - The bitv reasoner also performs the `(bv2nat ((_ int2bv n) e))` conversion more aggressively, i.e. for terms that reduce to `((_ int2bv n) e)` after simplification. - The bitv reasoner performs additional simplifications on `bv2nat` terms after normalization. For instance, `(bv2nat (concat x y))` becomes `(+ (* (pow 2 n) (bv2nat x)) (bv2nat y))` when `y` is a bit-vector of size `n`, and `(bv2nat (extract j i x))` becomes `(mod (div (bv2nat x) (pow 2 i)) (pow 2 (+ (- j i) 1)))` - Finally, the bitv reasoner requires that the result of `bv2nat` for a bitvector of size `n` is in the range `0 ..< 2^n` Combined, this allows exploiting the arith reasoner for a variety of problems mostly involving bitvector arithmetic with constants. In particular, we are able to handle many aritmetic expressions involving `bv2nat`. On the other hand, bitvector expressions involving `int2bv` have more limited support. Indeed, not only does `int2bv` perform less simplifications, we are currently not able to exploit the relevant equivalence: `int2bv[n](x) = int2bv[n](y) iff x = y (mod 2^n)`

bclement-ocp · 2023-07-12T13:15:40Z

Note: I am re-running benches on QF_BV. There was an issue with the old canonizer where we could have bv terms as aliens; the new canonizer (from #732 ) no longer has that issue.

But the tests are still very slow b/c there are many timeouts… there are 42k tests, and if they all time out, if would take 4h to run all the tests. In practice it looks like it's more around half of them timing out?

bclement-ocp · 2023-07-12T19:45:09Z

This is +1272-1843 which I assume is mostly due to #736 but still is a big drop.

Although there is also a relatively high number of segmentation faults (I tracked one to Z.to_string from ZArith on marvin, but couldn't reproduce locally) so… looks like something fishy is going on, maybe with the gmp version on marvin?

bclement-ocp · 2023-07-12T21:01:22Z

Although there is also a relatively high number of segmentation faults (I tracked one to Z.to_string from ZArith on marvin, but couldn't reproduce locally) so… looks like something fishy is going on, maybe with the gmp version on marvin?

Turns out the Z.to_string issue was spurious and the actual is a stack overflow because `bv2nat2 is not an bitvector but an integer and we were looping. Rerunning the tests.

bclement-ocp · 2023-07-13T07:57:26Z

This is now +1541-1693 (still on QF_BV) which is more acceptable, but we are still losing a lot. Maybe related to #736? On the other hand, we want to compute on ground problems!

(Note that, now that I think about it, this is probably a big part of the issue --- due to technical limitations we are currenty only able to simplify e.g. (bvadd #b01 #b01) into #b10 but not (bvadd x x) into #b10 in a context where we know x = #b01)

bclement-ocp · 2023-07-13T07:59:35Z

(Note that, now that I think about it, this is probably a big part of the issue --- due to technical limitations we are currenty only able to simplify e.g. (bvadd #b01 #b01) into #b10 but not (bvadd x x) into #b10 in a context where we know x = #b01)

Which I just figured out how to do, will try to make a PR soon.

bclement-ocp · 2023-07-13T08:43:47Z

(Note that, now that I think about it, this is probably a big part of the issue --- due to technical limitations we are currenty only able to simplify e.g. (bvadd #b01 #b01) into #b10 but not (bvadd x x) into #b10 in a context where we know x = #b01)

Which I just figured out how to do, will try to make a PR soon.

Well, actually no, because my solution would break solve.

Halbaroth · 2023-07-19T13:33:22Z

Can I merge this PR?

bclement-ocp · 2023-07-19T14:39:54Z

Yes let's merge this after our discussion

It turns out that OCamlPro#881 revealed a latent issue with the implementation in OCamlPro#733 When we encounter a [bv2nat] term, we create a *term* that represents the [bv2nat] computation, then call [X.make] on the result. This has the unfortunate effect that the sub-terms of that new term are not added to the Union-Find (or at least not before the term itself). This is bad, but was harmless at the time. However, the resulting term from a call of [bv2nat] includes divisions. And since OCamlPro#881 divisions are represented by an uninterpreted term rather than an existential variable — and [IntervalCalculus.add] inspects that term, rightfully expecting that its sub-terms have been added to the Union-Find, which in this case they are not, and so we crash. This patch fixes the issue by introducing an uninterpreted term for the result of [bv2nat] and adding the equality between that uninterpreted term and the [bv2nat] computation. This equality is then processed as a term on its own, ensuring that subterms are added to the Union-Find before the terms that contain them. Note that long-term, we probably want to get rid of the construction of [bv2nat] terms directly in [X.make] in favor of propagators in a follow-up to OCamlPro#944 . This would avoid needing to go through terms for this.

It turns out that #881 revealed a latent issue with the implementation in #733 When we encounter a [bv2nat] term, we create a *term* that represents the [bv2nat] computation, then call [X.make] on the result. This has the unfortunate effect that the sub-terms of that new term are not added to the Union-Find (or at least not before the term itself). This is bad, but was harmless at the time. However, the resulting term from a call of [bv2nat] includes divisions. And since #881 divisions are represented by an uninterpreted term rather than an existential variable — and [IntervalCalculus.add] inspects that term, rightfully expecting that its sub-terms have been added to the Union-Find, which in this case they are not, and so we crash. This patch fixes the issue by introducing an uninterpreted term for the result of [bv2nat] and adding the equality between that uninterpreted term and the [bv2nat] computation. This equality is then processed as a term on its own, ensuring that subterms are added to the Union-Find before the terms that contain them. Note that long-term, we probably want to get rid of the construction of [bv2nat] terms directly in [X.make] in favor of propagators in a follow-up to #944 . This would avoid needing to go through terms for this.

hra687261 approved these changes Jul 12, 2023

View reviewed changes

bclement-ocp force-pushed the bclement/bvint branch from b7ba6c8 to b89c481 Compare July 12, 2023 13:15

bclement-ocp added 2 commits July 12, 2023 22:57

Use Z.(~$1 lsl n) rather than Z.(pow ~$2 n)

df14d0f

Fix infinite loop: bv2nat has type int

be3da73

Halbaroth added the enhancement label Jul 18, 2023

Halbaroth self-assigned this Jul 18, 2023

Halbaroth added this to the 2.5.0 milestone Jul 18, 2023

bclement-ocp mentioned this pull request Jul 18, 2023

[Bitv] Add support for the SMT-LIB2's BV primitives #669

Closed

bclement-ocp merged commit 4bf5b1a into OCamlPro:next Jul 19, 2023
9 of 10 checks passed

bclement-ocp mentioned this pull request Nov 20, 2023

fix: Do not call [X.make] on non-subterms for bv2nat #954

Merged

bclement-ocp deleted the bclement/bvint branch January 3, 2024 14:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for ground bv2nat and int2bv reasoning #733

Add support for ground bv2nat and int2bv reasoning #733

bclement-ocp commented Jul 12, 2023

bclement-ocp commented Jul 12, 2023

bclement-ocp commented Jul 12, 2023

bclement-ocp commented Jul 12, 2023

bclement-ocp commented Jul 13, 2023

bclement-ocp commented Jul 13, 2023

bclement-ocp commented Jul 13, 2023

Halbaroth commented Jul 19, 2023

bclement-ocp commented Jul 19, 2023

Add support for ground bv2nat and int2bv reasoning #733

Add support for ground bv2nat and int2bv reasoning #733

Conversation

bclement-ocp commented Jul 12, 2023

bclement-ocp commented Jul 12, 2023

bclement-ocp commented Jul 12, 2023

bclement-ocp commented Jul 12, 2023

bclement-ocp commented Jul 13, 2023

bclement-ocp commented Jul 13, 2023

bclement-ocp commented Jul 13, 2023

Halbaroth commented Jul 19, 2023

bclement-ocp commented Jul 19, 2023